Speeding-up one-versus-all training for extreme classification via mean-separating initialization
Authors
Abstract
In this paper, we show that a simple, data-dependent way of setting the initial vector can be used to substantially speed up the training of linear one-versus-all classifiers in extreme multi-label classification (XMC). We discuss the problem of choosing the initial weights from the perspective of three goals: we want to start in a region of weight space (a) with low loss value, (b) that is favourable for second-order optimization, and (c) where the conjugate-gradient (CG) calculations can be performed quickly. For margin losses, such an initialization is achieved by selecting the initial vector so that it separates the mean of all positive (i.e., relevant for the given label) instances from the mean of all negatives – two quantities that can be calculated quickly for the highly imbalanced binary problems occurring in XMC. We demonstrate a speedup of $$5\times$$ on the Amazon-670K dataset with 670,000 labels. This comes in part from a reduced number of iterations needed due to starting closer to the solution, and in part from an implicit negative-mining effect that allows ignoring easy negatives in the CG step. Because of the convex nature of the optimization problem, the speedup is achieved without any degradation in accuracy. The implementation can be found at https://github.com/xmc-aalto/dismecpp .
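The core idea of a mean-separating initialization can be illustrated with a short sketch. This is a simplified illustration, not the paper's DiSMEC++ implementation: the function name, the bias placement at the midpoint between the two class means, and the use of plain (unscaled) mean difference are assumptions made for clarity.

```python
import numpy as np

def mean_separating_init(X, y):
    """Sketch of a mean-separating initialization for one binary
    one-versus-all subproblem.

    X : (n, d) feature matrix
    y : (n,) binary labels, 1 for instances relevant to the label

    Returns an initial weight vector w and bias b such that the
    hyperplane separates the mean of the positives from the mean
    of the negatives. Both means are cheap to compute even for the
    highly imbalanced problems arising in XMC.
    """
    mu_pos = X[y == 1].mean(axis=0)
    mu_neg = X[y != 1].mean(axis=0)
    # Point w from the negative mean toward the positive mean.
    w = mu_pos - mu_neg
    # Place the decision boundary halfway between the two means,
    # so the positive mean scores > 0 and the negative mean < 0.
    b = -0.5 * w @ (mu_pos + mu_neg)
    return w, b
```

In practice this vector would serve only as the starting point for the second-order (Newton/CG) solver of each binary subproblem; the convexity of the objective guarantees the same final solution regardless of initialization.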
Similar resources
Speeding up ResNet training
Time required for model training is an important limiting factor for faster pace of progress in the field of deep learning. The faster the model training, the more options researchers are able to try in the same amount of time, and the higher the quality of their results. In this work we stacked a set of techniques to optimize training time of the ResNet model with 20 layers and achieved a subs...
Speeding up the binary Gaussian process classification
Gaussian processes (GP) are attractive building blocks for many probabilistic models. Their drawbacks, however, are the rapidly increasing inference time and memory requirement alongside increasing data. The problem can be alleviated with compactly supported (CS) covariance functions, which produce sparse covariance matrices that are fast in computations and cheap to store. CS functions have pr...
Speeding Up Dijkstra's Algorithm for All Pairs Shortest Paths
We present a technique for reducing the number of edge scans performed by Dijkstra’s algorithm for computing all pairs shortest paths. The main idea is to restrict path scanning only to locally shortest paths, i.e., paths whose proper subpaths are shortest paths. On a directed graph with n vertices and m edges, the technique we discuss allows it to reduce the number of edge scans from O(mn) to ...
Speeding up computations via molecular biology
We show how to extend the recent result of Adleman [1] to use biological experiments to directly solve any NP problem. We then show how to use this method to speed up a large class of important problems.
Speeding up Training with Tree Kernels for Node Relation Labeling
We present a method for speeding up the calculation of tree kernels during training. The calculation of tree kernels is still heavy even with efficient dynamic programming (DP) procedures. Our method maps trees into a small feature space where the inner product, which can be calculated much faster, yields the same value as the tree kernel for most tree pairs. The training is sped up by using th...
Journal
Journal title: Machine Learning
Year: 2022
ISSN: 0885-6125, 1573-0565
DOI: https://doi.org/10.1007/s10994-022-06228-2